- Monday, September 2, 2024
After pausing it due to issues with historically inaccurate images, Google has reintroduced the image generation feature in its Gemini AI chatbot. Using brief text descriptions, the updated tool can create diverse types of images, from photorealistic landscapes to oil paintings. Enhanced safeguards preventing users from creating pictures of public figures, minors, and explicit scenes are now in place. Feedback from early users will guide further improvements.
- Tuesday, June 25, 2024
The Google AI JavaScript SDK allows developers to utilize Google's generative AI models like Gemini for various applications, such as text and image generation, and chat interactions.
- Thursday, August 29, 2024
Google is rolling out its new features, Gems and Imagen 3, to Gemini Advanced subscribers. Gems allows users to create custom versions of Gemini tailored to specific tasks, offering premade options like learning coaches and coding partners, while Imagen 3 is Google's latest image generation model, now available to generate detailed and artistic images.
- Tuesday, September 10, 2024
Google's AI Overviews, powered by the Gemini language model, faced heavy criticism for inaccuracies and dangerous suggestions after its U.S. launch. Despite the backlash, Google expanded the feature to six more countries, raising concerns among publishers about reduced traffic and misrepresented content. AI strategists and SEO experts emphasize the need for transparency and better citation practices to maintain trust and traffic.
- Wednesday, June 26, 2024
Google is reportedly working on an AI project that lets users talk to chatbots modeled after celebrities, YouTube influencers, and other personalities. The feature will allow anyone to create chatbots by describing their personality and appearance and then converse with them. It involves fine-tuning the Gemini family of language models to mimic the response style of the described character. The project may eventually be integrated into YouTube rather than launched as a standalone product.
- Thursday, April 11, 2024
Google has made its most advanced generative AI model, Gemini 1.5 Pro, available in public preview on its Vertex AI platform. It offers a large context window of up to 1 million tokens.
- Monday, August 19, 2024
Google has released Imagen 3, an advanced AI text-to-image generator, through its AI Test Kitchen and the Vertex AI platform. Imagen 3 is designed to produce images with enhanced details and lighting while maintaining content guardrails to prevent the creation of certain restricted imagery. Despite limitations, users have found ways to depict characters and logos resembling copyrighted material.
- Monday, March 4, 2024
Google's recent AI debacle with its Gemini image and text generation tool has sparked an internal crisis, with plummeting morale and calls for CEO Sundar Pichai's resignation. While the public focuses on potential bias in the tool, the root cause lies in organizational chaos: unclear ownership of Gemini's development and a lack of accountability within Google teams. This dysfunction led to hasty decision-making and insufficient coordination, resulting in the tool's problematic outputs.
- Tuesday, August 20, 2024
Gemini Live, Google's AI-powered voice interaction system, mimics natural conversation but struggles with hallucinations and inaccuracies. Despite incorporating professional actors for more expressive voices, it lacks the customizability and expressiveness found in competitors like OpenAI's Advanced Voice Mode. Overall, the bot's reliability issues and limited functionality make its practicality and purpose unclear, particularly as part of Google's paid AI Premium Plan.
- Tuesday, October 1, 2024
Meta has introduced an AI image generator that integrates with its messaging platforms, including WhatsApp, Facebook, and Instagram. This feature allows users to create images through text prompts, similar to existing AI art generators. During a demonstration at Meta Connect, the author tested the capabilities of this new tool using a Samsung Galaxy S22. The AI, powered by Meta's Llama 3.2 model, is designed to interpret both images and text within a conversational context. The experience highlighted some common challenges faced by users of AI art generators. The author initially requested an image of a family dressed as characters from Sonic the Hedgehog, but the output did not match the envisioned concept. Instead of a family, the AI produced a more generic depiction of a family in costumes, which included a male child. Frustrated, the author attempted to refine the image by changing the background to a space theme and asking for the characters to be depicted as flying. This led to a final image that diverged significantly from the original request, showcasing a man in a Sonic costume alongside three children. The demonstration underscored the importance of specificity when interacting with AI tools. The author reflected on the learning curve associated with using AI, noting that clearer and more detailed prompts might yield better results. As Meta rolls out this feature, it is expected that users will increasingly share AI-generated images on their social media feeds, contributing to the growing presence of AI in everyday digital interactions. In addition to the image generator, Meta has been making headlines for other AI advancements, including the ability to create deepfake videos and engage users in conversations using celebrity voices. These developments are part of a broader trend where AI technologies are becoming more integrated into various aspects of social media and communication.
- Friday, September 13, 2024
Google is rolling out Gemini Live, its conversational AI feature, to free Android users after a month of advanced user access. Users can interrupt responses with new information and receive text transcripts of their interactions. Although Gemini Live doesn't support extensions like Gmail yet, it offers ten new voice options, and more features are promised soon.
- Friday, July 12, 2024
Google's Gemini 1.5 Pro AI can train robots to navigate and complete tasks using video tours and natural language instructions, achieving a 90% success rate and demonstrating advanced planning capabilities.
- Thursday, April 11, 2024
The recent Google Cloud Next keynote showcased Google’s vast, scalable AI infrastructure, which is difficult to replicate. Gemini models can now be integrated with Google Search and enterprise data sources, reducing hallucinations and creating better answers. Gemini 1.5 Pro can process up to 1 million tokens of information, which is a breakthrough in long context understanding. Google is integrating Gemini into many of its enterprise products, such as Google Workspace and code development tools, which is another competitive advantage for them.
- Tuesday, September 3, 2024
Stack Overflow has banned the use of generative AI tools like ChatGPT for creating content on the platform due to the high rate of incorrect answers produced by these tools.
- Wednesday, April 10, 2024
The new version of Google's enhanced image-generating tool, Imagen 2, has been launched inside the Vertex AI developer platform. The model family can create and edit images with text prompts, similar to OpenAI's DALL-E and Midjourney. With Imagen 2, users can now remove unwanted parts of an image, add new elements, and expand an image's borders to create a wider field of view. As part of the upgrade, users can create "text-to-live images," which are short, four-second videos from text prompts.
- Thursday, May 16, 2024
Google is incorporating AI into its search engine with features like "AI Overviews" and a new planning tool. It aims to deliver summarized answers, automatic categorization, and personalized results using its Gemini AI to understand queries, generate summaries, and design result pages. For users, this could mean a completely new way to interact with the internet: less typing, fewer tabs, and more chatting with a search engine.
- Wednesday, May 15, 2024
Google introduced upgraded versions of its Gemini AI model, including a public preview of Gemini 1.5 Pro with a 2 million token context window at I/O 2024. For Android developers, the company highlighted Gemini Nano, an on-device AI model for faster local processing. Flutter and Dart also received updates, with Flutter now supporting compilation to WebAssembly and Dart introducing the initial stages of macros.
- Wednesday, May 22, 2024
Elon Musk's AI company, xAI, is advancing its Grok chatbot to support multimodal inputs, allowing users to upload photos and receive text-based answers.
- Monday, April 29, 2024
Apple has renewed discussions with OpenAI to use its generative AI technology to power new features coming to the iPhone later this year. The companies were discussing a deal earlier this year, but work between the two parties has been minimal since then. Apple is still in discussions with Google about licensing the Gemini chatbot. It is unclear which provider Apple plans to eventually use.
- Wednesday, September 18, 2024
Google plans to enhance Google Search by flagging AI-generated images in the "About this image" window. This feature relies on C2PA metadata to identify AI-manipulated photos, but its adoption faces challenges since only a few tools and cameras support these standards. While companies like Google, Amazon, and Adobe back C2PA, metadata can still be removed or corrupted. Despite these limitations, such measures aim to combat the rapid spread of deepfakes.
- Monday, August 19, 2024
Following a limited release to select Vertex AI users, Google's advanced text-to-image AI model, Imagen 3, is now available to all U.S. users through the ImageFX platform. This quiet expansion has sparked mixed reactions, with some praising the tool's improved texture and word recognition capabilities while others criticize its strict content filters.
- Monday, September 30, 2024
Google has recently implemented another redesign of the Gemini app's homescreen for Android, aiming for a more streamlined and minimalist user experience. This update follows earlier changes made to the app and reflects a shift towards simplicity in its interface. In the previous version, the homescreen featured a text box at the bottom that clearly indicated how users could interact with Gemini, including options to type, talk, or share a photo. The interface included buttons for voice and camera input, along with a plus button for uploads and a Gemini Live option positioned at opposite corners of the screen. The new design condenses these elements, placing the prompt on the same line as four functional buttons, similar to the layout seen during conversations. This change aligns the mobile interface more closely with the web version of Gemini. When users tap the text field, it expands to reveal the older interface, accompanied by a smooth animation as the keyboard appears. Closing the text field triggers a reverse animation, enhancing the overall user experience. The updated homescreen now greets users with a simple "Hello, [name]" message, and the Chats & Gems history is conveniently located in the top-left corner. This minimalist approach is reminiscent of the classic Google Search homepage, emphasizing ease of use and accessibility. For users who do not see the redesign immediately, a simple solution is to force stop the Gemini app and relaunch it to access the new features. In addition to the redesign, there are ongoing developments related to Gemini, including its integration with other Google products and enhancements to notifications for devices like the Pixel Buds Pro 2. The Gemini app continues to evolve, reflecting Google's commitment to improving user interaction across its platforms.
- Thursday, April 4, 2024
OpenAI's DALL-E now offers image editing tools both on the web and on mobile. There are preset style suggestions to help inspire image creation. The image generation platform has been integrated with ChatGPT - users can now edit DALL-E images in ChatGPT across web, iOS, and Android. Videos from OpenAI showing off the new features are available in the article.
- Thursday, August 8, 2024
Google announced new Gemini AI-powered features for Google Home, including intelligent captions for Nest camera footage, natural language processing for Home routine creation, and an upgraded, more natural-sounding Google Assistant. The advanced features, primarily behind a Nest Aware subscription paywall, aim to enhance the smart home experience with the rollout commencing in beta and expanding next year. As part of the move toward smarter home automation, Google envisions an assistant that can proactively manage complex and dynamic home environments.
- Tuesday, July 23, 2024
Figma briefly rolled back its 'Make Designs' AI feature from its limited beta after realizing it generated UI designs that too closely resembled existing apps. The feature uses off-the-shelf AI models like GPT-4 and Amazon's Titan and it needs refinement to ensure originality. Figma aims to re-enable the feature with improved QA processes to better assist designers in leveraging AI for efficient design creation.
- Wednesday, May 15, 2024
Google has enhanced its AI content watermarking technology, SynthID, to now mark AI-generated text and video. This technology is crucial for distinguishing between AI-created and authentic content to prevent misuse, such as spreading misinformation or generating nonconsensual media.
- Friday, July 5, 2024
WhatsApp is developing an AI avatar generator called "Imagine" which uses user-supplied images, text prompts, and Meta's AI Llama model to create avatars. Reference images can be deleted at any time. There's no release date for the feature yet.
- Tuesday, June 11, 2024
AI image generation has progressed from creating images based on text descriptions since 2022. Using a child's game analogy, this article explains how these models refine noisy inputs to produce detailed and specific images, showcasing the rapid advancements and potential of AI in visual creativity.
- Thursday, May 16, 2024
Google is rolling out its AI Overviews search to U.S.-based users of Google Search. AI Overviews gives answers to queries using generative AI technology powered by Google Gemini. It provides a few snippets of an answer based on its understanding of queries and the content it found on the topic across the web.
- Friday, June 28, 2024
Instagram's new "AI Studio" lets creators develop AI chatbot versions of themselves. It is currently rolling out as an early test in the US.